Parametric Treatment of cDNA Microarray Data
Abstract
As each set of microarray data is affected by variations in experimental conditions, appropriate normalization processes are required. Various approaches to such normalization have been proposed; they generally involve adjusting each pair of data sets, often the simultaneously hybridized R and G probes. Chen et al. [1] introduced a model in which the signal ratio between the probes is normally distributed; essentially the same concept is used in the processing of the Stanford Microarray Database [6]. Recently, Yang et al. [4] extended this method by stabilizing the average of the signal ratio, which can be biased depending on signal intensity. Such non-parametric methods have been further improved by the introduction of a parameter that also maintains the variance in the signal ratios [2, 3]. An alternative, simple parametric method that assumes a lognormal distribution of the data has also been widely employed. Besides being easy to calculate, it yields z-scores for the data, a possible common unit for data comparisons. However, microarray data often have a skewed distribution, and this low fidelity to the distribution model severely limits the accuracy of the data. In the parametric process, the estimation of the background, which is defined as the constant part of the additive noise [1], can be a major source of normalization inaccuracies. In most cases, the background is estimated from the area outside the DNA spot in the image data, based on the assumption that the background on a chip is uniform. However, as the DNA spots and the intact surface of the chip can bind free dyes at different densities, such estimations are clearly prone to errors. To check this possibility, an alternative estimation method based on the signal intensity of control DNA is tested against the conventional image-based method. The distributions of both sets of processed data are investigated using probability plots.
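The parametric treatment described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the simulated intensities, the background value of 200, and the plotting positions `(i + 0.5) / n` are all assumptions made for illustration. The sketch subtracts a constant background, log-transforms the signals, converts them to z-scores, and measures how straight the resulting normal probability plot is.

```python
import math
import random
from statistics import NormalDist, mean, stdev


def z_normalize(intensities, background):
    """Subtract a constant background, take logs, and convert to z-scores.

    Spots at or below the background get None, because their logarithm
    is undefined.
    """
    logs = [math.log(x - background) if x > background else None
            for x in intensities]
    valid = [v for v in logs if v is not None]
    mu, sigma = mean(valid), stdev(valid)
    return [None if v is None else (v - mu) / sigma for v in logs]


def normal_probability_plot(zscores):
    """Return (theoretical quantile, observed z) pairs, sorted by rank."""
    observed = sorted(z for z in zscores if z is not None)
    n = len(observed)
    std_normal = NormalDist()
    # Plotting positions (i + 0.5) / n keep every quantile finite.
    theoretical = [std_normal.inv_cdf((i + 0.5) / n) for i in range(n)]
    return list(zip(theoretical, observed))


def plot_correlation(zscores):
    """Pearson correlation of the probability plot; 1.0 means a straight line."""
    pairs = normal_probability_plot(zscores)
    xs = [p[0] for p in pairs]
    ys = [p[1] for p in pairs]
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in pairs)
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Simulated data (hypothetical): lognormal signal plus a constant background.
rng = random.Random(0)
true_background = 200.0
spots = [rng.lognormvariate(6.0, 1.0) + true_background for _ in range(2000)]

# Compare a well-estimated background with no background correction at all.
r_good = plot_correlation(z_normalize(spots, true_background))
r_none = plot_correlation(z_normalize(spots, 0.0))
print(f"background subtracted: r = {r_good:.4f}")
print(f"background ignored:    r = {r_none:.4f}")
```

In this simulated setting, a misestimated background leaves the log-transformed data skewed, which shows up as curvature in the probability plot and a lower correlation. Real arrays would additionally need a rule for handling spots that fall below the estimated background, which this sketch simply excludes.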